An Effective Bypass Mechanism to Enhance Branch Predictor for SMT Processors
Authors
Abstract
Unlike traditional superscalar processors, simultaneous multithreaded (SMT) processors can exploit both instruction-level parallelism and thread-level parallelism at the same time. With the same fetch width, an SMT processor fetches less deeply from any single thread than a traditional superscalar processor does. Meanwhile, instructions from different threads all share the same functional units. Together, these characteristics make it possible to improve SMT performance by reducing branch mispredictions. Based on the observation that the directions of about 15% of branch instructions can be known with certainty at the prediction cycle, a simple and effective bypass mechanism is proposed. The scheme does not depend on any existing branch predictor and can be used as an enhancement to any of them. Execution-driven simulation results show that, compared with a simple baseline (gshare) predictor, the branch misprediction rate of our predictor decreases by more than 15% on average and instruction throughput improves by about 2.5%.
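The abstract does not say how the bypass is implemented, so the following is only a minimal sketch of the general idea, under stated assumptions. It pairs a textbook gshare predictor (2-bit saturating counters indexed by the branch address XORed with global history) with a wrapper that skips the table lookup whenever the branch direction is already known. The names `GShare`, `predict_with_bypass`, and `known_direction` are hypothetical and do not come from the paper. How a real pipeline would detect that a direction is known at prediction time is not modeled here.

```python
from typing import Optional


class GShare:
    """Textbook gshare predictor with 2-bit saturating counters."""

    def __init__(self, index_bits: int = 10) -> None:
        self.mask = (1 << index_bits) - 1
        # Every counter starts at 1 ("weakly not taken").
        self.counters = [1] * (1 << index_bits)
        self.history = 0  # global branch history register

    def _index(self, pc: int) -> int:
        # gshare: XOR the branch address with the global history.
        return (pc ^ self.history) & self.mask

    def predict(self, pc: int) -> bool:
        # Counter values 2 and 3 mean "predict taken".
        return self.counters[self._index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
        # Shift the actual outcome into the history register.
        self.history = ((self.history << 1) | int(taken)) & self.mask


def predict_with_bypass(predictor: GShare, pc: int,
                        known_direction: Optional[bool] = None) -> bool:
    """Hypothetical bypass: use the direction if it is already known.

    Assumption: the caller passes `known_direction` only when the branch
    outcome is certain at prediction time; otherwise it passes None and
    the underlying predictor is consulted as usual.
    """
    if known_direction is not None:
        return known_direction
    return predictor.predict(pc)
```

Because the wrapper only intercepts the lookup, any predictor exposing a `predict(pc)` method could stand in for `GShare`, which mirrors the abstract's claim that the scheme is independent of the underlying predictor.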
Similar resources
A latency-conscious SMT branch prediction architecture
Executing multiple threads has proved to be an effective solution to partially hide latencies that appear in a processor. When a thread is stalled because a long-latency operation is being processed, like a memory access or a floating-point calculation, the processor can switch to another context so that another thread can take advantage of the idle resources. However, fetch stall conditions cau...
Tolerating Branch Predictor Latency on SMT
Simultaneous Multithreading (SMT) tolerates latency by executing instructions from multiple threads. If a thread is stalled, resources can be used by other threads. However, fetch stall conditions caused by multi-cycle branch predictors prevent SMT from achieving its full potential performance, since the flow of fetched instructions is halted. This paper proposes and evaluates solutions to deal with...
Exploring branch target buffer access filtering for low-energy and high-performance microarchitectures
Powerful branch predictors along with a large branch target buffer (BTB) are employed in superscalar and simultaneous multi-threading (SMT) processors for instruction-level parallelism and thread-level parallelism exploitation. However, the large BTB not only dominates the predictor energy consumption, but also becomes a major roadblock in achieving faster clock frequencies at deep sub-micron t...
Building an SMT Application Simulator
It also requires examination of various processor and architecture design decisions. SMT processors may exhibit different cache, branch-prediction, and utilization patterns than conventional processors [10, 9]. While studies of several of these factors have been undertaken, there are many more variables to be examined; each component found on a conventional chip may behave differently when seve...
Evaluating Branch Predictors on an SMT Processor
Simultaneous multithreading (SMT) provides significant increases in microprocessor throughput by issuing instructions from multiple threads per clock cycle. SMT can be realized in a wide-issue superscalar with a modest increase in resources, because much of the hardware is shared among the multiple thread contexts. Branch prediction accuracy, a key component of microprocessor performance, can s...